
Australian Government 
Department of Home Affairs


Major Incident Review 


J - Compromised - Cargo, Traveller and other Border Systems unavailable - 
Monday 29th April, 2019


Priority: P1

Outage Period: 6 hours 47 minutes


Description: 


From 0558hrs on Monday 29th April 2019 users reported being unable to access a range of systems, including s. 47E(d).
Users were prompted with 'Cannot reach this page'.

A major incident was declared and a number of technical teams were engaged to investigate; this was managed via a
Service Restoration Team.

The cause of the incident was identified as a hardware failure, specifically a Line Card on Network Distribution Switch 1 at
the s. 47E(d) Data Centre. To restore services, the faulty card was removed and minor patch configurations
performed. This action restored services and confirmation was received from various business areas including airports.


A restart of JVMs for s. 47E(d) was required. This was likely due to current known errors for which regular
restarts are needed.


Services were restored from a technical perspective at 1115hrs. Final confirmation of restoration was received from
business areas at 1244hrs; some local reboots of Smartgates were required to trigger connections to the network.

A 24 hour period of monitoring was undertaken to ensure stability of services and there were no issues raised during this 
time. The incident was resolved and closed at 1500hrs on Tuesday 30 April 2019. 
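
The outage figures quoted across these reports can be cross-checked from their own timestamps. The sketch below is illustrative only: it assumes the 6 hours 47 minutes runs from the first reported call (0558hrs) to the time stakeholders advised systems were back up (1245hrs), which the review does not state explicitly, and the helper name `elapsed` is hypothetical.

```python
from datetime import datetime, timedelta

def elapsed(start_hhmm: str, end_hhmm: str) -> timedelta:
    """Return the time between two same-day 24-hour 'HHMM' timestamps."""
    fmt = "%H%M"
    return datetime.strptime(end_hhmm, fmt) - datetime.strptime(start_hhmm, fmt)

# Assumed interval: first call to the Service Desk -> business confirmation.
business_outage = elapsed("0558", "1245")
# Assumed interval: incident recorded -> service restored (vendor PIR figures).
technical_outage = elapsed("0706", "1113")

print(business_outage)   # 6:47:00
print(technical_outage)  # 4:07:00
```

Under these assumptions the two intervals match the 6 hours 47 minutes in this review and the 4 hours 7 minutes of downtime recorded in the vendor Post Incident Reports.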


There was also an impact on active major incident for s. 47E(d) reference s. 47E(d). Actions for this incident needed to
be suspended as this outage limited the ongoing investigations.


Business Impact: 


An unscheduled outage of multiple IT systems occurred; this included s. 47E(d)


This outage resulted in a risk to national security, trade enforcement, migration systems, and border protection.
This outage affected processing times at international airports, and was subject to open source media reporting.

Processing of international cargo was halted due to redirecting ABF resources to process passengers manually for the 
duration of the outage. 


Ongoing activities: 


During this incident, it was identified that a configuration issue prevented failover to a second switch. A proactive
major incident was raised on Wednesday 1 May because of confirmation of a lack of failover/redundancy. An
emergency change was successfully completed at 0030hrs on Thursday 2 May to configure the second switch to
enable failover.

A Major Incident Review was completed on Friday 3 May and this has identified a number of points for clarification 
and immediate next steps. 

IBM have prepared a risk assessment and recommendation to action the replacement of faulty hardware and
complete a test of the failover.

Failover testing of the redundancy put in place under the above-mentioned P2 is being managed under a P1 Problem Record,
PM4001051. This will be planned and agreed with ABF.


Identification:


0558hrs - First call regarding outage of National Intelligence System received at the IT Service Desk
0600hrs - Service Desk advised Unisys MIM and called through to IBM Service Desk for report of P3.

0638hrs - s. 22(1)(a)(ii) called s. 22(1)(a)(ii) to advise that Arrivals-Departures/s. 47E(d) were all unavailable.
s. 22(1)(a)(ii) then contacted Unisys MIM and advised that a Major Incident should be raised based on the
business impact.

0706hrs - Pageout for P2 was then sent at 07:10hrs (07:06hrs raised) by Unisys MIM.

0730hrs - Major Incident escalated as a P1.


Prioritisation: 


Major Incident was initially raised as a P2 at 07:09hrs and upgraded to a P1 at 07:30hrs.

Noting the significant impact to business, a Service Restoration Team was stood up. In total, 5 checkpoint
teleconference dial-ins were conducted throughout the day while the Major Incident was in-flight.


Investigation and diagnosis 


0745hrs - ICT WIO Technician identified a number of servers that were down. At this point, the issues were isolated 
to IBM. 

0832hrs - IBM carried out health checks and stage investigation.

0841hrs - Verified that a Line Card was down on the Network Distribution Switch 1 at the s. 47E(d) and the assumed
failover did not occur to Switch 2. Although traffic was moving across to Switch 2, it was not moving out to the WAN.


Resolution activities 


IBM performed the following activities in an attempt to failover to Switch 2: 

0855hrs - Shut down the VLANs to Switch 1 so failover to Switch 2 could occur. The VLANs were seen on Switch 2,
however, there was no traffic flowing through Switch 2 to the Wide Area Network (WAN).

1009hrs - Agreed to turn off Switch 1 to force a physical failover to Switch 2 in case there was some corruption
in Switch 1 preventing the failover from occurring. Once Switch 1 was turned off, Switch 2 could see the VLANs the same
as before, however, there was still no WAN network traffic.

1105hrs - Given the continuing issues with Switch 2 not failing over, Switch 1 was restarted again and services were
moved from the failed Line Card to another working Line Card on Switch 1, which involved physically moving cables
from the failed Line Card to a working Line Card on Switch 1. When the relocation of the cables was complete, traffic
started to flow on Switch 1. This resolved the issue and allowed traffic to flow over the WAN.

1115hrs - Majority of airports reported restoration of services, with the remaining resolved by following SOPs and
performing system restarts.


Resolution confirmation 
All impacted systems and services were coming back on-line around 1115hrs and associated checks and balances were
being undertaken by business to confirm services restored.

Traveller & Cargo needed to perform a restart of JVM for s. 47E(d).

Restart of s. 47E(d) fixed some issues, however, full functionality of this was dependent on an IBM fix that was to
be deployed into PROD later that evening.

IT and business stakeholders advised all affected systems were back up and running by 1245hrs and will continue to
monitor. All systems appeared to be stable by 1457hrs.


Incident monitoring period 



Major Incident was placed into monitoring for a further 24 hours to confirm stability. At 1500hrs the next day, the Major
Incident was placed into a resolved state and closed.


Immediate post incident action 


Requested Post Incident Reports from IBM and Unisys. 

Daily Major Incident Summary: 01 May 2019 provided to Senior Executive. 


Action items from this review 

- IBM to draft a risk assessment as part of their implementation plan in relation to the above-mentioned P3 Incident;

- Unisys MIM to confirm other reports of Incidents called through to the IT Service Desk between 0630-0710hrs;

- Unisys MIM to update consolidated Unisys PIR i.e. what times were the Teleconferences held, and when updates
were provided to the broader group;

- IBM to provide update on when they contacted the Service Desk about Citrix issue;

- IBM, Cargo & Trade, ABF Operations Systems Management, s. 22(1)(a)(ii) to review Cargo
Processing between 1114-1230hrs;

- ICT SM to have a discussion with ABF Operations Systems Management to understand the greater impact of the
issues;

- Change Management to review the time-frame of lodged change post-incident and the change process;

- IBM to advise what existing monitoring system triggered an alert for the line card failure and what methods are in
place for IBM to receive alerts when access is unavailable via Citrix;

- IBM to perform failover testing in PROD, once the routing change is deployed there;

- Problem Ticket to be raised to investigate failover of Smartgate Arrivals and Departures in similar scenarios; and

- IBM Problem Management and Network Assurance to assess changes put through for the lower environments, with
an idea to move into prod - change records required to validate time frames.


MIR Attendees


Name 



Title 


Optus 

ICT Service Management Director 
IBM 

ICT Service Management 
IBM 

ICT Service Management 
ICT Service Management 
ICT Vendor Management 
Vendor & MOU Management 
IBM 

ICT Service Management 

Vendor & MOU Management 

ICT Service Management 

ICT Service Management 

Unisys 

Unisys 

Unisys 

ICT Service Management 

IBM 

Wintel 

ABF Operations Systems 
Traveller Systems 
Traveller Systems 
Network Operations & Assurance 
Network Operations & Assurance 
ABF Operations Systems 
Cargo & Trade Systems 


For further information in relation to the incident or the PIR corrective actions, please refer to the attac 


1 — PIR (IBM input) 



IBM PIR - PI - 
IM4576792 29th Apr 


2-PIR (Unisys) 



Unisys 

Consolidated PIR - F 


Released by Department of Home Affairs
under the Freedom of Information Act 1982

















Incident Summary:


Post Incident Report 


Brief Incident description: 

Multiple Systems Down - Prompting 'Cannot reach this page' 

Incident Number: 

s. 47E(d) 

External Reference Numbers: 

IN1699357 

Priority Level: 

Priority 1 

PIR Author: 

s. 22(1)(a)(ii)

Actual downtime of Incident: 

4 hours 7 minutes 

PIR Review & Input provided 

by: (Org/Name/Section) 

IBM Network and Service 

Delivery manager 

Date/Time of Incident Recorded: 

29/04/2019 07:06 

Resolving Group: 

IBM - Network Services 

Date/Time of Incident Resolution: 

29/04/2019 11:13 

No. of Users affected: 

TBA 

Environment impacted: 

E9 - Production 

Related change number if 
applicable: (please provide 
number and title) 

N/A 

PIR Request Date: (If P2/3/4 Incident) 

N/A 

Method of Detection: 

Staff 

Service / System / Device affected: 

s. 47E(d) 

Geographic Location: 

Australia 

Actual Business impact: 

TBA - to be provided by the Department 

Business and System Owner: 

TBA 


Incident Timeline & Resolution: 

What steps were performed to restore
services? 

What specific action was taken to 
restore services? 


Was Root Cause identified during the 
incident? If so please provide detail. 


7:15 IBM were engaged. IBM technical support (Unix, Web Support) teams investigated. 

7:45 IBM Unix confirmed servers were up but unable to logon directly from Putty. IBM Web Support 
unable to connect to Smartgate servers. 

Users unable to access s. 47E(d)

7:49 IBM Unix advised all data network seems down, servers accessible only by admin network. VIOS 
shows that physical links are down. 

7:57 IBM Network team engaged. 

08:16 IBM DCMS at s. 47E(d) Data Centre were engaged to inspect core switch.

08:32 IBM Network team advised the link to the edge switch to Distribution switch is down 

08:41 IBM DCMS performed physical inspection and confirmed Line Card 9 is down on the primary
s. 47E(d) Distribution Switch 01.

08:54 IBM Network team: An automatic failover of Distribution Switch 01 to Switch 02 should have
occurred. As the failover was unsuccessful, the following actions were taken to force the traffic to
Switch 02 manually, but were also unsuccessful.

Shutdown the vlans on the Distribution Switch 01 
Shut all access connections to Distribution Switch 01 


09:50 IBM DCMS team were requested by the Network Team to power off s. 47E(d) Distribution Switch
01 to force the failover to Switch 02 as there could be some open connections stopping the failover
from occurring. IBM Network team confirmed on Distribution Switch 02 that all vlans were active,
however no routes were being learned on Distribution Switch 02 from the Optus WAN - therefore still
no traffic flow.

10:10 IBM Network team arranged hardware call for Distribution Switch 01. 

10:20 IBM DCMS restarted Distribution Switch 01 and reseated the Line Card, however Line Card
9 was still red.

10:49 IBM identified another Line Card in Distribution Switch 01 that could be used. 

11:05 IBM DCMS repatched the connection (Edge switch to Distribution switch 01) from Line Card 9 
(faulty) to Line Card 8 (functioning). IBM Network team configured the port and enabled it. 

11:13 Service restored. Network logs verify the repatched DS01 was back up at April 29, 11:13:27


11:14 Mainframe log entry confirms the mainframe communication restored 
AC01 2019119 11:14:29.05 T0227565 00000090 $HASP100 DCDLE2 ON TSOINRDR 




Root cause was a Hardware Failure on s. 47E(d) Distribution Switch 1, which connects Distribution Switch
1 to the Edge Switch.

Network traffic should have failed over from Distribution Switch 1 to Distribution Switch 2 for
the Edge Switch. This failover did not occur successfully because the static route was missing on the
Edge Switch to route traffic to and from Distribution Switch 2.
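
A missing static route of this kind is the sort of gap a pre-failover configuration audit can surface. The following is a minimal sketch, not IBM's tooling: `StaticRoute`, `missing_failover_routes`, and all addresses (taken from the documentation ranges 192.0.2.0/24 and 198.51.100.0/24) are hypothetical, and the actual switch configuration is not given in this report.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class StaticRoute:
    prefix: str    # destination network, e.g. "198.51.100.0/24"
    next_hop: str  # address of the distribution switch the route points at

def missing_failover_routes(routes, prefixes, next_hops):
    """List (prefix, next_hop) pairs that have no static route configured."""
    configured = {(r.prefix, r.next_hop) for r in routes}
    return sorted(
        (p, h) for p in prefixes for h in next_hops if (p, h) not in configured
    )

# Hypothetical addresses standing in for Distribution Switch 1 and 2.
DS1, DS2 = "192.0.2.1", "192.0.2.2"
edge_routes = [StaticRoute("198.51.100.0/24", DS1)]  # only the primary path

gaps = missing_failover_routes(edge_routes, ["198.51.100.0/24"], [DS1, DS2])
print(gaps)  # [('198.51.100.0/24', '192.0.2.2')]
```

An empty result would indicate that every prefix has a route via both switches; it would not by itself prove that failover works, which is why the failover test remains a separate action.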

Was the resolution temporary or
permanent? 


Temporary. 




Has the issue occurred before? Are there 
any noticeable trends or Patterns? 


No. 

What is the likelihood of this issue re-occurring?


Low. 


















What stakeholders need to be engaged 
for Root Cause Analysis (include any 
Business Reps that should be included)? 

IBM Network team have identified the root cause as above. 

Is Incident related to an existing 

Problem or Known Error? 

No. 

Reference Number: 


Service Improvement Activities: 

Have any service improvement activities 
been identified. If so please provide 
detail? 

Short Term: 

Faulty hardware (Line Card 9 of s. 47E(d) Distribution Switch 01) needs to be replaced.

Vlans and access ports need to be reactivated on s. 47E(d) Distribution Switch 01.

Once the faulty hardware has been replaced, repatch cables back from Line Card 8
(functional) to Line Card 9 (currently down) of s. 47E(d) Distribution Switch 01.

The static route of Edge Switch should be added for s. 47E(d) Distribution Switch 02.

Complete fail over testing of s. 47E(d) Distribution Switch


Incident Summary:


Post Incident Report 


Brief Incident description: 

Multiple Systems Down - Prompting 'Cannot reach this page' 

Incident Number: 

s. 47E(d) 

External Reference Numbers: 

IN1699357 

Priority Level: 

Priority 1 

PIR Author: 

s. 22(1)(a)(ii) - IBM

s. 22(1)(a)(ii) - MIM

Actual downtime of Incident: 

4 hours 7 minutes 

PIR Review & Input provided 

by: (Org/Name/Section) 

IBM Network and Service 

Delivery manager 

Date/Time of Incident Recorded: 

29/04/2019 07:06 

Resolving Group: 

IBM - Network Services 

Date/Time of Incident Resolution: 

29/04/2019 11:13 

No. of Users affected: 

TBA 

Environment impacted: 

E9 - Production 

Related change number if 
applicable: (please provide 
number and title) 

N/A 

PIR Request Date: (If P2/3/4 Incident) 

N/A 

Method of Detection: 

Staff 

Service / System / Device affected: 

s. 47E(d)

Geographic Location: 

Australia 

Actual Business impact: 

TBA - to be provided by the Department

Business and System Owner: 

TBA 


Incident Timeline & Resolution:

What steps were performed to restore
services? 

What specific action was taken to 
restore services? 


07:00 - Major Incident Manager (MIM) engaged WIO.

07:06 - MIM raised priority 2 incident.

07:10 - MIM engaged Optus Technician to investigate.

07:15 - MIM engaged IBM.

07:15 - IBM were engaged. IBM technical support (Unix, Web Support) teams investigated.

07:18 - Optus engaged MIM to advise their SNOC have performed checks and were unable to see any
alerts.

07:20 - IBM engaged MIM to advise of their reference number s. 47E(d).

07:25 - VBOX on call engaged MIM in regards to the issue.

07:30 - MIM raised incident from a priority 2 to a priority 1 due to the impact to Smartgates and
border systems.

07:30 - IBM engaged IBM Web Posting to investigate.

07:30 - MIM contacted WIO who advised a technician was still investigating and to contact Secure
Gateway Services (SGS).

07:44 - MIM attempted to contact SGS on call - no answer, left a voicemail.

07:45 - MIM WIO Technician noticed a number of servers down and were working to restore them.

07:45 - IBM Unix team confirmed servers were up but unable to logon directly from Putty. IBM Web
Support unable to connect to Smartgate servers.

Users unable to access s. 47E(d)

07:49 - IBM Unix team advised all data network seems down, servers accessible only by admin
network. VIOS shows that physical links are down.

07:57 - IBM Network team engaged.

08:04 - IBM Unix team advised WAN (supported by Optus), AIX Solaris, Intel servers are all affected,
not just AIX. However, further investigation was ongoing to confirm.

08:10 - IBM were unable to connect via jumpbox to perform checks which meant a core switch or
WAN was down.

08:16 - IBM DCMS at s. 47E(d) Data Centre were engaged to inspect core switch.

08:30 - IBM Unix confirmed they could not ping either of the Airport servers from their Home Affairs
PC.

08:31 - IBM Network team managed to log in and check switches:

Confirmed Optus link was back up
Confirmed Edge switch was back up

08:32 - IBM Network team advised the link to the edge switch to Distribution switch is down.

08:34 - IBM engaged MIM to contact Optus to confirm WAN connectivity and also check the link to
s. 47E(d).

08:35 - MIM engaged Optus on call to confirm WAN connectivity and to check the link to
s. 47E(d). Optus advised the core switch (last hop in to s. 47E(d)) is currently reachable across the WAN. Optus
have performed ping tests from s. 47E(d) and from s. 47E(d). Latency from s. 47E(d)
DC core switch were stable overnight.

08:41 - IBM DCMS performed physical inspection and confirmed Line Card 9 is down on the primary
Distribution Switch 01.

08:42 - MIM scheduled a Service Restoration Team (SRT) meeting at 09:00 with the relevant resolver
teams attending.

08:54 - IBM Network team: An automatic failover of Distribution Switch 01 to Switch 02 should have
occurred. As the failover was unsuccessful, the following actions were taken to force the traffic to
Switch 02 manually, but were also unsuccessful.

Shutdown the vlans on the Distribution Switch 01
Shut all access connections to Distribution Switch 01

09:00 - SRT 1 commenced

• Optus Technicians noticed core switch traffic started to drop at 05:45 from the domain
controller.

• WIO Technicians advised they are unable to see anything from the s. 47E(d)
centre.

• Network switch should have failed over but did not, IBM are trying to fail over to
switch 2.

09:50 - IBM DCMS team were requested by the Network Team to power off Distribution Switch
01 to force the failover to Switch 02 as there could be some open connections stopping the failover
from occurring. IBM Network team confirmed on Distribution Switch 02 that all vlans were active,
however no routes were being learned on Distribution Switch 02 from the Optus WAN - therefore still
no traffic flow.















Was Root Cause identified during the 
incident? If so please provide detail. 


Was the resolution temporary or 
permanent? 


09:57 - IBM Network team advised that no routes were being learned on DS02 from the outside
and that they were currently troubleshooting the issue.

10:02 - IBM escalated with Telstra to define the next course of action 
10:10 — IBM Network team arranged hardware call for Distribution Switch 01. 

10:20 - IBM DCMS restarted Distribution Switch 01 and reseated the Line Card, however
Line Card 9 was still red.

10:26-IBM Unix confirmed so far AIX server/CBR00011 and application are all good on OS 

4r t(Q) 

health check. Only external connectivity is an issue. 

10:30 - IBM Unix confirmed the switch configuration is fine, however traffic is not routing through. 
10:33 - IBM were in progress of restarting DS01 switch 

10:38-Telstra Leadership engaged Hardware Support and Cisco for assistance. 

10:39 - DPE engaged Optus for assistance with traffic flows. 

10:40 - IBM Network team advised they were investigating the Line Card still being red after DCMS 
restart. 

10:49 - IBM identified another Line Card in Distribution Switch 01 that could be used. 

11:04 - Optus advised they could see direct connections but were investigating as they could see no 
routes. 

11:05 - IBM DCMS repatched the connection (Edge switch to Distribution switch 01) from Line Card 9 
(faulty) to Line Card 8 (functioning). IBM Network team configured the port and enabled it. 

11:13 - Service restored. Network logs verify the repatched DS01 was back up at April 29, 11:13:27
11:14 - Mainframe log entry confirms the mainframe communication restored
11:16 - IBM advised DS01 was back up, however DS02 was still the primary. 

11:17- IBM Unix confirmed they were able to connect directly to servers. 

11:18 - Optus advised they were seeing a significant amount of data flowing through. 

11:19 - IBM advised replication is now starting to flow through and backlog was decreasing.
11:21 - IBM advised they were in the process of reactivating the VLANs on DS01.

11:27 - DPE notified client s. 22(1)(a)(ii) that services may have been restored but still pending
confirmation from end users.

11:29 - VBOX brought up a gate in Sydney and restarted a GTM 

11:30 - IBM Web Hosting team advised Smartgate health check had come back all good.

11:33 - MORPHO brought up their gates. All arrival gates were operating in Melbourne.

11:49 - IBM confirmed all Oracle Databases had health checked as green.

11:50 - IBM advised s. 47E(d) health check was still ongoing.

12:47 - IBM advised Smartgate Arrivals and Departures, s. 47E(d) health
check was all fine. Everything seemed to be running except for s. 47E(d), which was still being
investigated.

12:53 - IBM advised Smartgate replication is down from what it was but is almost up to date.

Cargo was back up and up to date with no backlogs.

s. 47E(d) replication was still slightly behind, however it was still flowing through and
decreasing.

s. 47E(d) was confirmed fine by Mary Stewart.

s. 47E(d) was back to same performance (degrade) as last week, which was lodged under
separate priority 1 incident s. 47E(d).

There were no other reports of any other ongoing issues. 

13:42 - MIM placed priority 1 incident into monitoring stage to confirm stability and continued 
communications between business stakeholders and IT. 

14:57 - MIM placed priority 1 incident into monitoring for the next 24 hours to further confirm
stability. ABOC were notified and agreed with this action.

30/04/2019 15:05 - MIM resolved priority 1 incident as IT and business stakeholders had advised all
affected systems were running as intended with no issues reported.

Root cause was a Hardware Failure on s. 47E(d) Distribution Switch 1, which connects Distribution Switch
1 to the Edge Switch.

Network traffic should have failed over from s. 47E(d) Distribution Switch 1 to Distribution Switch 2 for
the Edge Switch. This failover did not occur successfully because the static route was missing on the
Edge Switch to route traffic to and from Distribution Switch 2.

Has the issue occurred before? Are there
any noticeable trends or Patterns? 

No. 

What is the likelihood of this issue re-occurring?

Low. 

What stakeholders need to be engaged 
for Root Cause Analysis (include any 
Business Reps that should be included)? 

IBM Network team have identified the root cause as above. 

Is Incident related to an existing 

Problem or Known Error? 

No. 

Reference Number: 


Service Improvement Activities: 

Have any service improvement activities 
been identified. If so please provide 
detail? 

Short Term: 

Faulty hardware (Line Card 9 of s. 47E(d) Distribution Switch 01) needs to be replaced.

Vlans and access ports need to be reactivated on s. 47E(d) Distribution Switch 01.

Once the faulty hardware has been replaced, repatch cables back from Line Card 8
(functional) to Line Card 9 (currently down) of s. 47E(d) Distribution Switch 01.

The static route of Edge Switch should be added for s. 47E(d) Distribution Switch 02.

Complete fail over testing of s. 47E(d) Distribution Switch


Australian Government



Department of Home Affairs 


For-Official-Use-Only 

Major Incident Review - Thursday, July 25 2019 



- TRIPS/Mainframe - Multiple Services Impacted 


Priority: P1

Outage Period: 6 hours 20 minutes


Description: 


On Monday July 15, ABF reported an issue with all Departure SmartGates nationwide due to an 'unexpected movement'
error. This resulted in all passengers being referred to the primary line, causing severe delays in passenger processing. 

The incident was first reported at 0433hrs and 4 Priority 3 tickets were raised within the next hour. Given the business 
impact, Unisys Major Incident Manager (MIM) escalated the incident to a Priority 2 at 0537hrs. A separate Priority 2
incident was raised at 0550hrs regarding s. 47E(d) being unavailable, however, both in-flight Priority 2 incidents were closed
and managed under a Priority 1 incident once ICT Border Mainframe confirmed that SmartGates, s. 47E(d) were all
experiencing performance issues.

The issue has been attributed to an authorised change (C4534173) that caused an ICT Border Mainframe communication
device (BROKER) to fail, causing s. 47E(d) processing to queue at the mainframe. The change was deployed on Monday July
15 at 0001hrs and implementation concluded at 0400hrs.

At 0729hrs, IBM advised that they had successfully restarted BROKER and ICT Border Mainframe confirmed s. 47E(d) was
now available and s. 47E(d) was working as intended. There was a backlog of 30,400 Expected Movements that needed to be
processed and by 1009hrs, technicians advised that the data had been loaded into s. 47E(d). However, there were still 30,000
jobs that needed to be transferred to s. 47E(d) to resolve the issue. At 1304hrs, ICT Border Mainframe confirmed that the
backlog had been cleared and once the ABF confirmed that all airports were operating as intended, the major incident 
was resolved at 1317hrs. 


Root cause for the BROKER failure is not yet known, however, logs from the BROKER have identified that there was some
communication issue between the Adabas processes and/or between the client/server. There is ongoing dialogue
between IBM and the vendor, Software AG, who have advised that there is a fix that has been released in June for this
BROKER issue. Problem Management investigations are continuing.

Post major incident declaration, the incident was handled effectively until it was resolved. 


Business impact: 




Between 0517-1300hrs, all Departure SmartGates Nationwide were impacted as expected movement data was 
unavailable. All passengers had to be referred to the primary line for manual processing, causing severe delays in 
passenger processing. 

Multiple systems, including s. 47E(d), had limited to no functionality due to data not flowing through to the
Mainframe, which impacted the ability to process Visa applications and s. 47E(d) between 0526-0729hrs.


Key points of note: 


The issue was first reported to the IT Service Desk at 0433hrs and a subsequent call was made to Mainframe Operations
on-call, however, only 1 attempt was made with no further follow-up or escalation per the on-call escalation list. It was
recommended for Unisys to review the initial ticket and due process around contacting. This may have been a missed
opportunity to identify a major incident earlier, therefore, potentially reducing the backlog of Expected Movements
post-remediation.
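
The follow-up gap noted here (a single call attempt with no escalation) can be illustrated with a simple escalation loop. This is a sketch under assumptions: the contact names, the `attempts_per_contact` value, and the `try_contact` callback are all hypothetical and do not describe the actual Unisys on-call process.

```python
from typing import Callable, Optional, Sequence

def escalate(
    contacts: Sequence[str],
    try_contact: Callable[[str], bool],
    attempts_per_contact: int = 2,
) -> Optional[str]:
    """Walk the on-call list in order; return the first contact who answers."""
    for contact in contacts:
        for _ in range(attempts_per_contact):
            if try_contact(contact):
                return contact
    return None  # nobody answered: the caller should raise this further

# Hypothetical on-call list and a stub where only the secondary answers.
on_call = ["mainframe-primary", "mainframe-secondary", "duty-manager"]
reached = escalate(on_call, lambda c: c == "mainframe-secondary")
print(reached)  # mainframe-secondary
```

The point of the design is that an unanswered call never ends the process on its own; the loop only stops when someone answers or the list is exhausted.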































In terms of monitoring/tracking, IBM have solutioned an alert situation for the CICS timeout message in s. 47E(d) that will be
implemented on 31 July 2019. ICT Mainframe Midrange Database will also look into developing a heartbeat between
s. 47E(d) that will exercise BROKER and NAT RPCs to identify workflow.
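
A heartbeat of this kind could take roughly the following shape. This is a minimal sketch, assuming a probe function that performs one round trip through BROKER and returns True on success; `HeartbeatMonitor`, `probe`, and the failure threshold are hypothetical and not taken from the IBM or ICT Mainframe Midrange Database design.

```python
from typing import Callable

class HeartbeatMonitor:
    """Raise an alert after a run of consecutive failed probes."""

    def __init__(self, probe: Callable[[], bool], max_failures: int = 3):
        self.probe = probe
        self.max_failures = max_failures
        self.consecutive_failures = 0

    def beat(self) -> bool:
        """Run one probe; return True if an alert should fire."""
        try:
            ok = self.probe()
        except Exception:
            ok = False  # a probe that errors or times out counts as a failure
        self.consecutive_failures = 0 if ok else self.consecutive_failures + 1
        return self.consecutive_failures >= self.max_failures

# Stub probe: two successes, then the (hypothetical) round trip stops working.
results = iter([True, True, False, False, False])
monitor = HeartbeatMonitor(lambda: next(results), max_failures=3)
alerts = [monitor.beat() for _ in range(5)]
print(alerts)  # [False, False, False, False, True]
```

Requiring several consecutive failures before alerting is one way to avoid paging on a single slow response; the right threshold and probe interval would depend on how quickly a queue at the mainframe becomes a business impact.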

In regards to preventative measures, ICT Mainframe Midrange Database advised that the system needs to be shut down
and restarted automatically during every maintenance window for associated changes. An expectation statement 
confirming the process that is to be followed when implementing changes during a maintenance window will be 
provided to establish this. 

Lastly, a further meeting will be scheduled to clarify the roles and responsibilities across IBM, ICT Mainframe Midrange 
Database and ICT Border Mainframe. 


Communication back to business: 


From a business impact perspective, re-validation of the period used for change outage windows may need to be 
considered to take into account the tolerances for processing of passengers at airports. It was raised that Immigration 
Systems Management should also be included in these discussions given the potential impacts s. 47E(d)


Post-incident action items 


A.1

Review the incident timeline to confirm what actions 
occurred between 0652-0710hrs and advise. 

IBM 

Due 26/07/2019 
- PENDING 


A.2 

Clarify the context of the PIR/MIR processes to ensure 
that any preceding P3/P4s are documented and made 
available to MIR attendees. 

ICT Service Delivery 
Management 

Due 2/08/2019 
- PENDING 


A.3 

Review ticket SD29014535 and due process around 
contacting: 

- Was this an opportunity to identify a major incident 
earlier if Mainframe contact had been made? 

Unisys 

Due 2/08/2019 
- PENDING 


A.4 

Investigate putting in an alert situation for the CICS 
timeout message in s. 47E(d) (this to be included with
other problem management tasks due on 8/08). 

IBM 

Due 8/08/2019 
- PENDING 


A.5 

Review existing change outage window timings to 
establish if alternate windows should be considered. 

ABF/Visa Processing 
Teams 

Due 29/07/2019 
- PENDING 


A.6 

Review backlog processing with ICT Border 

Mainframe, ITCAPM and IBM. 

Traveller Systems 

Taken on notice 


A.7 

Explore the feasibility of flight selector in emergency 
situations. 

Traveller Systems/ABF 

Taken on notice 


A.8 

Create a heartbeat between s. 47E(d) that will
exercise Broker and NAT RPCs to identify workflow.

ICT Mainframe
Midrange Database

Due 9/08/2019
- PENDING


A.9

Formulate an expectation statement confirming the
process that is to be followed when implementing
changes during a maintenance window.

ICT Mainframe
Midrange Database

Due 2/08/2019
- PENDING


A.10

Establish a process that ensures change
implementation plans are communicated prior to
deployment.

IBM/ICT Mainframe
Midrange Database

Due 2/08/2019
- PENDING


A.11

Organise a meeting between the teams to clarify
roles and responsibilities across IBM, ICT Mainframe
Midrange Database & ICT Border Mainframe.

ICT Service Delivery
Management

Due 2/08/2019
- PENDING


MIR Invitees/Attendees 


S 22(1)(a)(ii) 


ICT Service Management 
Unisys MIM 

ICT Service Management 

ICT Change Management 

ICT Problem Management 

ICT Service Delivery Management 

ICT Service Management Officer 

IBM 

ICT Mainframe Midrange Database 
IBM 

Mainframe Midrange Database 
IBM 

Vendor & MOU Management 

Vendor & MOU Management 

Traveller Systems 

Mainframe Midrange Database 

Mainframe Midrange Database 

ICT Capacity & Performance Management 

ICT Capacity & Performance Management 

IBM 

IBM 

IBM 

Traveller Systems 

ICT Problem Management 


Unisys Service Desk Management (did not attend) 


Author 


Name 

Title 



ICT Service Delivery Management 


For further information in relation to the associated incident, please refer to the attached. 
1. Unisys PIR 
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Incident Summary: 

Post Incident Report 


Brief Incident description: 

s. 47E(d)/Mainframe - Multiple Services Impacted

Incident Number: 

s. 47E(d) 

External Reference Numbers: 

IN1789042 - IBM 

Priority Level: 

Priority 1 

PIR Author: 

s. 22(1)(a)(ii) - MIM

Actual downtime of Incident: 

6 Hours 20 Minutes 

PIR Review & Input provided 

by: (Org/Name/Section) 

s. 22(1)(a)(ii) - Border Mainframe

s. 22(1)(a)(ii) - IBM

Date/Time of Incident Recorded: 

Jul 15, 2019 06:56 

Resolving Group: 

Border Mainframe 

Date/Time of Incident Resolution: 

Jul 15, 2019 13:16 

No. of Users affected: 

Unknown 

Environment impacted: 

E9 - Production 

Related change number if 
applicable: (please provide 
number and title) 

N/A 

PIR Request Date: (If P2/3/4 Incident) 

Jul 15, 2019 13:16 

Method of Detection: 

Reported by Sydney Airport 

Service / System / Device affected: 


Geographic Location: 

Australia 

Actual Business impact: 

s. 47E(d) data stopped flowing, causing the following impacts:

Between the hours of 05:17 and 13:00 all Departure SmartGates nationwide were
impacted: expected movement data was unavailable and the gates referred all passengers to the
primary line for manual processing, causing severe delays in passenger processing.

Multiple systems including s. 47E(d) had limited to no functionality due to data not

flowing through to the Mainframe, this impacted the ability to process Visa applications and 

between 05:26 and 07:29. 

Business and System Owner: 

s. 22(1)(a)(ii) (business owner), s. 22(1)(a)(ii) (system owner)

Incident Timeline & Resolution: 
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What steps were performed to restore 
services? 

What specific action was taken to 
restore services? 


00:01 - Change C4534173 commenced.

01:20 - Upgrade complete. Adabas is up without CICS. That will be delayed until after initial
BVT.

01:30 - Initial TVT of upgrade was conducted, but this is fairly limited and databases showed
no errors.

01:34 - IBM technician investigating DBPL010 problem.

02:30 - IBM technician resolved DBPL010 problem. 

02:41 - BVT advised initial testing successful. 

03:00 - Databases restarted through normal start up process. 

03:20 - Online system made available and further upgrade testing was conducted, without 
any errors detected. 

03:20 - s. 47E(d) batch jobs also started with some being successful but others failing. The Visa
Send job is a crucial s. 47E(d) job and it had a load library error.

03:20 - Referrals Visa Load jobs had similar failures.

03:50 - Visa send job issue was resolved.

04:56 - s. 47E(d) on-call was able to rerun Visa Load job successfully.

04:54 - Service Desk (SD) raised P3 incident IM4594672 regarding Sydney receiving the "No
Expected Movement" error on Departure Smartgates.

04:57 - SD engaged s. 47E(d) on-call technician to notify of issue.

05:17 - SD engaged MIM to advise of P3 s. 47E(d)

05:20 - SD engaged MIM to advise Melbourne Airport was receiving the same error
message.

05:26 - SD engaged MIM advising users were unable to access s. 47E(d) and raised P3 incident
IM4594673.


05:27 - MIM engaged s. 47E(d) on-call who advised the link between s. 47E(d) is not
functioning correctly.

05:30 - MIM engaged Border Mainframe to investigate.

05:30 - SD engaged MIM advising users were unable to search in s. 47E(d) and the application
would freeze - MIM requested they run the s. 47E(d) bat fix and test in Citrix and advise
of results.

05:37 - MIM raised P2 incident IM4594674. 

05:40 - MIM re-engaged Border Mainframe technician who advised there is a delay in the 
EMR that would be causing the error message. 

05:40 - Border Mainframe technician confirmed s. 47E(d) was completely unavailable and this
was already being investigated.

05:50 - MIM raised P2 incident IM4584675 regarding s. 47E(d) being unavailable.

06:22 - Border Mainframe identified the issue was caused by a BROKER on the Mainframe
that had failed, causing a halt to s. 47E(d) data.

06:40 - SD engaged MIM to advise multiple users were unable to search in s. 47E(d) and the
application was freezing when attempting to do so.


06:42 - Border Mainframe technician advised an IBM technician was engaged to help
investigations.

06:50 - s. 47E(d) support advised IBM to stop & start BROKER.

06:51 - MIM engaged IBM for reference number - IN1789042.

06:52 - IBM logged into OPSMVS on IMPA and attempted stop via OPSMVS.

06:55 - Border Mainframe technician confirmed SmartGate, s. 47E(d) issues were all
related due to the fault with the BROKER.

06:56 - MIM raised P1 incident IM4564683.

06:58 - BROKER is not stopping - IBM checked via SDSF and could not see a Stop
issued. IBM called support asking if IBM could cancel, s. 47E(d) support confirmed.

06:59 - MIM closed P2 IM4594675 and IM4594674 to consolidate and manage under P1
IM4564683.

07:00 - IBM attempted cancel via SDSF.

07:05 - BROKER still not coming down.

07:10 - IBM cancelled BROKER using address space.
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07:12 - BROKER restarted and now back up processing. At this point other 4 7E(d) batch jobs 
which were stalled commenced running again. 

07:14 - IBM contacted MIM and advised that the s. 47E(d) connections are showing down in
Sydney but all other Airports seem to be up and running and shouldn't be having issues.
IBM technician advised this would need to be investigated by the application team (Border
Mainframe).

07:25 - MIM engaged BOC to validate widespread impact given IBM's update. BOC advised 
all Airports are still having issues. 

07:29 - Border Mainframe technician advised they had successfully restarted BROKER. After
restarting BROKER, s. 47E(d) was available and s. 47E(d) was working as intended. Border Mainframe
advised there was a backlog of 30,400 Expected Movements to be processed.

08:13 - Border Mainframe technician advised backlog had dropped to 27,500 Expected
Movements.

09:00 - s. 47E(d) backlog of expected movements was cleared but airport departure gates
were still affected because the records also need to be loaded into s. 47E(d)

09:07 - Border Mainframe technician advised the backlog of Expected movements had been
loaded in to s. 47E(d)

09:28 - MIM engaged Sydney Control Room to test Smartgates given the data was believed 
to have cleared. Sydney Airport staff advised the issue is still ongoing. 

09:30 - All parties had attended DORM, it was revealed that change C4534173 impacted 
BROKER communications. 

09:58 - MIM engaged Melbourne Airport Control Room to test Smartgates given the data 
was believed to have cleared. Sydney Airport staff advised the issue is still ongoing. 

09:59 - MIM re-engaged Border Mainframe technician to advise the error message was still 
being received. 

10:09 - Border Mainframe technician advised that the data had been loaded into s. 47E(d) but
at this time there were still 30,000 jobs that needed to be transferred to s. 47E(d) to
resolve this issue.

10:59 - Backlog at approximately 23,000 

11:39 - Backlog at approximately 15,000 

12:24 - Backlog at approximately 8,000 

12:46 - Backlog at approximately 5,500 

13:04 - Border Mainframe confirm that backlog has cleared. 

13:05 - MIM engaged ABF - Advised will contact ABOC and instruct to contact Airports to 
test. 

13:12 - ABF confirmed all Airports operating as intended.

13:17 - MIM resolved P1 incident IM4594683.
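
The backlog figures reported between 10:09 and 13:04 allow a rough estimate of how quickly the queue drained. The Python sketch below is illustrative arithmetic only: it uses the approximate figures from the timeline above, treats "backlog has cleared" at 13:04 as zero outstanding (an assumption), and the function names are hypothetical.

```python
from datetime import datetime

# Approximate backlog figures as reported in the timeline (time, records outstanding).
# The final value of 0 at 13:04 is an assumption based on "backlog has cleared".
BACKLOG = [
    ("10:09", 30000),
    ("10:59", 23000),
    ("11:39", 15000),
    ("12:24", 8000),
    ("12:46", 5500),
    ("13:04", 0),
]


def minutes_between(start: str, end: str) -> float:
    """Minutes elapsed between two HH:MM times on the same day."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds() / 60


def drain_rates(samples):
    """Records cleared per minute for each consecutive pair of samples."""
    rates = []
    for (t0, b0), (t1, b1) in zip(samples, samples[1:]):
        rates.append((t0, t1, (b0 - b1) / minutes_between(t0, t1)))
    return rates


def average_rate(samples) -> float:
    """Overall records cleared per minute from the first to the last sample."""
    (t0, b0), (t1, b1) = samples[0], samples[-1]
    return (b0 - b1) / minutes_between(t0, t1)


if __name__ == "__main__":
    for t0, t1, rate in drain_rates(BACKLOG):
        print(f"{t0} to {t1}: about {rate:.0f} records per minute")
    print(f"Overall: about {average_rate(BACKLOG):.0f} records per minute")
```

On these figures the queue took 175 minutes to clear from 30,000, an average of roughly 171 records per minute. An estimate of this kind could support earlier advice to airports about expected restoration times, subject to the accuracy of the reported counts.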





Was Root Cause identified during the 
incident? If so please provide detail. 


1. A number of batch jobs were failing due to the load library configuration. 

• There was no business impact due to this issue. The errors were detected and 
fixed during the midnight to 4:00am change window. 

• The ADABAS upgrade required a new load library (vendor supplied software 
modules) to be included for batch jobs. IBM updated the standard procedure/script to 
include the new load library. Some batch jobs, even though they referenced the standard 
procedure did not pick up the new load library and were looking for the previous version. 

2. Lack of connectivity between key systems, s. 47E(d)

• The software component called BROKER was malfunctioning and this was the root 
cause of the issues. The SAGRPCP task was not connected to the BROKER. 

Note: The IBM draft root cause analysis is attached. 

• This issue resulted in significant business impact causing departure SmartGates to 
refer all passengers to the primary line. 

• BROKER provides connectivity between the key systems s. 47E(d). It is a 3rd party
software product provided by Software AG, who also provide
and support the ADABAS software.

• BROKER appeared to be up and active following the change. Details are referenced 
in the console snapshot in IBM's report. 

• IBM have obtained the error log from BROKER. The error log was sent to Cyber 
Security on the 16th of July, approved on the 17th of July and released to Software AG for 
their analysis. 


Was the resolution temporary or 
permanent? 


Permanent 


Has the issue occurred before? Are there 
any noticeable trends or Patterns? 


No 


What is the likelihood of this issue
re-occurring?


Unlikely 


What stakeholders need to be engaged 
for Root Cause Analysis (include any 
Business Reps that should be included)? 


IBM ADABAS DBAs 
Traveller systems 


Is Incident related to an existing 
Problem or Known Error? 


Yes 


Reference Number: 


PM4001075 


Service Improvement Activities: 

Have any service improvement activities 
been identified. If so please provide 
detail? 


Refine checklist of post upgrade to include checks in BROKER for any errors.

Include reminders in the checklist to contact the MIM and other affected parties if
there are problems so that appropriate information can be relayed to stakeholders.

During BVT ensure all associated systems are operating as intended.

Complete and signoff on BVT.

Review monitoring of s. 47E(d) and dependent systems.

Review back out plan and impact of s. 47E(d) changes.

Next s. 47E(d) outage window is the 19th August. It is expected that all monitoring
recommendations are implemented prior to this date.
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Incident Summary: 

Post Incident Report 


Brief Incident description: 

TRIPS/Mainframe - Multiple Services Impacted 

Incident Number: 

s. 47E(d) 

External Reference Numbers: 

IN1789042 - IBM 

Priority Level: 

Priority 1 

PIR Author: 

s. 22(1)(a)(ii) - MIM

Actual downtime of Incident: 

6 Hours 20 Minutes 

PIR Review & Input provided 

by: (Org/Name/Section) 

s. 22(1)(a)(ii) - Border Mainframe

s. 22(1)(a)(ii) - IBM

Date/Time of Incident Recorded: 

Jul 15, 2019 06:56 

Resolving Group: 

Border Mainframe 

Date/Time of Incident Resolution: 

Jul 15, 2019 13:16 

No. of Users affected: 

Unknown 

Environment impacted: 

E9 - Production 

Related change number if 
applicable: (please provide 
number and title) 

N/A 

PIR Request Date: (If P2/3/4 Incident) 

Jul 15, 2019 13:16 

Method of Detection: 

Reported by Sydney Airport 

Service / System / Device affected: 


Geographic Location: 

Australia 

Actual Business impact: 

TRIPS data stopped flowing causing the following impacts: 

Between the hours of 05:17 and 13:00 all Departure SmartGates Nationwide were 
impacted, expected movement data was unavailable and referring all passengers to the 
primary line for manual processing, causing severe delays in passenger processing. 

Multiple systems including s. 47E(d) had limited to no functionality due to data not

flowing through to the Mainframe, this impacted the ability to process Visa applications and 
passenger risk assessments between 05:26 and 07:29. 

Business and System Owner: 

s. 22(1)(a)(ii) (business owner), s. 22(1)(a)(ii) (system owner)

Incident Timeline & Resolution:
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What steps were performed to restore 
services? 

What specific action was taken to 
restore services? 


00:01 - Change C4534173 commenced.

01:20 - Upgrade complete. Adabas is up without CICS. That will be delayed until after initial
BVT.

01:30 - Initial TVT of upgrade was conducted, but this is fairly limited and databases showed
no errors.

01:34 - IBM technician investigating DBPL010 problem.

02:30 - IBM technician resolved DBPL010 problem. 

02:41 - BVT advised initial testing successful. 

03:00 - Databases restarted through normal start up process. 

03:20 - Online system made available and further upgrade testing was conducted, without 
any errors detected. 

03:20 - s. 47E(d) batch jobs also started with some being successful but others failing. The Visa
Send job is a crucial s. 47E(d) job and it had a load library error.

03:20 - s. 47E(d) and Referrals Visa Load jobs had similar failures.

03:50 - Visa send job issue was resolved.

04:56 - s. 47E(d) on-call was able to rerun Visa Load job successfully.

04:54 - Service Desk (SD) raised P3 incident s. 47E(d) regarding Sydney receiving the "No
Expected Movement" error on Departure Smartgates.

04:57 - SD engaged s. 47E(d) on-call technician to notify of issue.

05:17 - SD engaged MIM to advise of P3 s. 47E(d)

05:20 - SD engaged MIM to advise Melbourne Airport was receiving the same error
message.

05:26 - SD engaged MIM advising users were unable to access s. 47E(d) and raised P3 incident
s. 47E(d)

05:27 - MIM engaged s. 47E(d) on-call who advised the link between s. 47E(d) is not
functioning correctly.

05:30 - MIM engaged Border Mainframe to investigate.

05:30 - SD engaged MIM advising users were unable to search in s. 47E(d) and the application
would freeze - MIM requested they run the s. 47E(d) bat fix and test in Citrix and advise
of results.

05:37 - MIM raised P2 incident IM4594674. 

05:40 - MIM re-engaged Border Mainframe technician who advised there is a delay in the 
EMR that would be causing the error message. 

05:40 - Border Mainframe technician confirmed s. 47E(d) was completely unavailable and this
was already being investigated.

05:50 - MIM raised P2 incident s. 47E(d) regarding s. 47E(d) being unavailable.

06:22 - Border Mainframe identified the issue was caused by a BROKER on the Mainframe
that had failed, causing a halt to s. 47E(d) data.

06:40 - SD engaged MIM to advise multiple users were unable to search in s. 47E(d) and the
application was freezing when attempting to do so.

06:42 - Border Mainframe technician advised an IBM technician was engaged to help
investigations.

06:51 - MIM engaged IBM for reference number - s. 47E(d)

06:55 - Border Mainframe technician confirmed SmartGate, s. 47E(d) issues were all
related due to the fault with the BROKER.

06:56 - MIM raised P1 incident s. 47E(d)

06:59 - MIM closed P2 s. 47E(d) and s. 47E(d) to consolidate and manage under
s. 47E(d)

07:12 - BROKER restarted after many difficulties in restarting. At this point other s. 47E(d)
batch jobs which were stalled commenced running again.

07:14 - IBM contacted MIM and advised that the s. 47E(d) connections are showing down in
Sydney but all other Airports seem to be up and running and shouldn't be having issues.
IBM technician advised this would need to be investigated by the application team (Border
Mainframe).

07:25 - MIM engaged BOC to validate widespread impact given IBM's update. BOC advised
all Airports are still having issues.

07:29 - Border Mainframe technician advised they had successfully restarted BROKER. After
restarting BROKER, s. 47E(d) was available and s. 47E(d) was working as intended. Border Mainframe
advised there was a backlog of 30,400 Expected Movements to be processed.

08:13 - Border Mainframe technician advised backlog had dropped to 27,500 Expected
Movements.

09:00 - s. 47E(d) backlog of expected movements was cleared but airport departure gates
were still affected because the records also need to be loaded into s. 47E(d)

09:07 - Border Mainframe technician advised the backlog of Expected movements had been
loaded in to s. 47E(d)

09:28 - MIM engaged Sydney Control Room to test Smartgates given the data was believed 
to have cleared. Sydney Airport staff advised the issue is still ongoing. 

09:30 - All parties had attended DORM, it was revealed that change C4534173 impacted 
BROKER communications. 

09:58 - MIM engaged Melbourne Airport Control Room to test Smartgates given the data 
was believed to have cleared. Sydney Airport staff advised the issue is still ongoing. 

09:59 - MIM re-engaged Border Mainframe technician to advise the error message was still
being received.

10:09 - Border Mainframe technician advised that the data had been loaded into s. 47E(d) but
at this time there were still 30,000 jobs that needed to be transferred to s. 47E(d) to
resolve this issue.

10:59 - Backlog at approximately 23,000 

11:39 - Backlog at approximately 15,000 

12:24 - Backlog at approximately 8,000 

12:46 - Backlog at approximately 5,500 

13:04 - Border Mainframe confirm that backlog has cleared. 

13:05 - MIM engaged ABF - Advised will contact ABOC and instruct to contact Airports to 
test. 

13:12 - ABF confirmed all Airports operating as intended.

13:17 - MIM resolved P1 incident s. 47E(d)


Was Root Cause identified during the
incident? If so please provide detail. 


1. A number of batch jobs were failing due to the load library configuration. 

• There was no business impact due to this issue. The errors were detected and 
fixed during the midnight to 4:00am change window. 

• The ADABAS upgrade required a new load library (vendor supplied software 
modules) to be included for batch jobs. IBM updated the standard procedure/script to 
include the new load library. Some batch jobs, even though they referenced the standard 
procedure did not pick up the new load library and were looking for the previous version. 

2. Lack of connectivity between key systems, s. 47E(d)

• The software component called BROKER was malfunctioning and this was the root
cause of the issues. The SAGRPCP task was not connected to the BROKER.

Note: The IBM draft root cause analysis is attached.

• This issue resulted in significant business impact causing departure SmartGates to
refer all passengers to the primary line.

• BROKER provides connectivity between the key systems s. 47E(d). It is a 3rd party
software product provided by Software AG, who also provide
and support the ADABAS software.

• BROKER appeared to be up and active following the change. Details are referenced
in the console snapshot in IBM's report.

• IBM have obtained the error log from BROKER. The error log was sent to Cyber
Security on the 16th of July, approved on the 17th of July and released to Software AG for
their analysis.















Was the resolution temporary or 
permanent? 

Permanent 

Has the issue occurred before? Are there 
any noticeable trends or Patterns? 

No 

What is the likelihood of this issue
re-occurring?

Unlikely 

What stakeholders need to be engaged 
for Root Cause Analysis (include any 
Business Reps that should be included)? 

IBM ADABAS DBAs 

Traveller systems 

Is Incident related to an existing 

Problem or Known Error? 

Yes 

Reference Number: 

PM4001075 

Service Improvement Activities: 

Have any service improvement activities 
been identified. If so please provide 
detail? 

• Refine checklist of s. 47E(d) post upgrade to include checks in BROKER for any errors.

• Include reminders in the checklist to contact the MIM and other affected parties if 
there are problems so that appropriate information can be relayed to stakeholders. 

• During BVT ensure all associated systems are operating as intended. 

• Complete and signoff on BVT. 

• Review monitoring of s. 47E(d) and dependent systems.

• Review back out plan and impact of s. 47E(d) changes.

• Next s. 47E(d) outage window is the 19th August. It is expected that all monitoring
recommendations are implemented prior to this date.
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Australian 

BORDER FORCE 


**Please update the classification of the document if information provided in the 

background is above FOUO** 


MEDIA ENQUIRY 



Subject: Sydney Airport Delays 


Deadline: ASAP 

s. 47F(1) 


Enquiry Received (Time & Date): 8:27am 15 July 2019 


Media Officer: s 22(1)(a)(ii) Media Ph: 02 6264 2244 


QUESTION / ISSUE 

I understand there's significant delays at Sydney airport (and possibly nationwide) with security 

and e-gates systems down. 

Can I receive a statement and more information on what these delays ASAP? 

RESPONSE UNCLASSIFIED 

• A number of Australian Border Force (ABF) and Department of Home Affairs IT systems
impacted by an earlier outage have now been restored. 

• The Department is continuing work to bring all systems back online, ensure the integrity 
of the systems and resolve any ongoing issues. 

• Additional ABF staff have been deployed to process passengers at international airports 
and to minimise delays in cargo processing where possible. 

• While the addition of staff has seen reduced delays at some airports, passengers are still 
encouraged to arrive at airports early to allow additional time for processing. 

• Cargo processing is continuing, though some delays can be expected as staff work 
through the backlog. 

• We appreciate the patience of passengers and businesses impacted by these outages. 

BACKGROUND (not for public release)

The information below is classified and should not be publicly released without the authority of the 

Australian Border Force. 


A short, unclassified brief providing background/context to the incident/issue/event which 
may not be clear from the rest of the document; the background must detail actions 
taken by agency/departments/other stakeholders in the information environment, 
propaganda by adversaries/interest groups and highlight sensitive considerations. 


FOR OFFICIAL USE ONLY


The background may point to further correspondence on a higher classification system if 
required. 


CLEARANCE: 


Drafted by 

Title 

Time/Date drafted 



Time DD Month 2019 


Cleared by 

Title 

Time/Date cleared 

Full Name 

Position 

Time DD Month 2019 



Time DD Month 2019 

s. 22(1 )(a)(ii) 

Director, ABF Media 

Time DD Month 2019 

Tony Smith 

CoS to Commissioner 
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From: 

ABF Media 

To: 


Cc: 

ABF Media 

Subject: 

RESPONSE: Airport systems [SEC=UNCLASSIFIED] 

Date: 

Monday, 29 April 2019 3:04:11 PM 


UNCLASSIFIED 

Good afternoon S ' 47F(1) 

Please see below an updated statement. Please attribute to an Australian Border Force 
spokesperson. 

A number of Australian Border Force (ABF) and Department of Home Affairs IT systems impacted 
by an earlier outage have now been restored. 

The Department is continuing work to bring all systems back online, ensure the integrity of the 
systems and resolve any ongoing issues. 

Additional ABF staff have been deployed to process passengers at international airports and to 
minimise delays in cargo processing where possible. 

While the addition of staff has seen reduced delays at some airports, passengers are still 
encouraged to arrive at airports early to allow additional time for processing. 

Cargo processing is continuing, though some delays can be expected as staff work through the 
backlog. 


We appreciate the patience of passengers and businesses impacted by these outages. 


Thank you, 


Australian Border Force 

Media & Communications 
Media line: 02 6264 2211 
E: media@abf.gov.au 

UNCLASSIFIED 
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From: ABF Media 

To: s.47F(i) 

Cc: ABF Media 

Subject: RESPONSE: UPDATED: Media enquiry: verifying cause and attribution of Monday ABF / Smartgate outage 

[SEC=UNCLASSIFIED] 

Date: Tuesday, 30 April 2019 4:53:48 PM 

Attachments: image001.png


UNCLASSIFIED 

Good afternoon, 

Please attribute the following to a spokesperson from the Australian Border Force (ABF). 

All IT systems are now back online and the Department and ABF are continuing to test and 
monitor systems to prevent further issues. 

Passenger processing is occurring as normal. The ABF has also deployed additional resources to 
ensure cargo is cleared in a timely manner. 

The outage of IT systems yesterday was not directly related to Smartgates, but was linked to a 
back end network issue that impacted a number of systems, including those used to process 
passengers and cargo. 

Regards 

Media Operations 
Australian Border Force 
Media line: 02 6264 2211 
E: media@abf.gov.au
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From: 

ABF Media

To:

s. 22(1)(a)(ii)

Cc:

s. 22(1)(a)(ii); ABF Media

Subject:

RE: FOR INPUT/CLEARANCE: SECURITY DELAYS - AIRPORT [DLM=For-Official-Use-Only]

Date:

Monday, 15 July 2019 9:28:37 AM

Attachments:

190715 EN Sydney Airport Delays Various.docx


For-Official-Use-Only 
Hi all, 

Please see the below for lines we used while the previous incident was ongoing: 

• A number of Australian Border Force (ABF) and Department of Home Affairs IT 
systems impacted by an earlier outage have now been restored. 

• The Department is continuing work to bring all systems back online, ensure the 
integrity of the systems and resolve any ongoing issues. 

• Additional ABF staff have been deployed to process passengers at international 
airports and to minimise delays in cargo processing where possible. 

• While the addition of staff has seen reduced delays at some airports, passengers 
are still encouraged to arrive at airports early to allow additional time for 
processing. 

• Cargo processing is continuing, though some delays can be expected as staff work 
through the backlog. 

• We appreciate the patience of passengers and businesses impacted by these 
outages. 

Grateful if you can advise if you are happy with the above and they are more relevant to this 
particular issue? 

Kind regards, 

S. 22(1 )(a)(ii) 

Public Affairs Officer, Media Operations 

Media & Engagement Branch | Executive Coordination 

Department of Home Affairs 
Media line: 02 6264 2244 P: s ' 22(1)(a)(ii) 

E: media@homeaffairs.gov.au 


For-Official-Use-Only 














From: ABF Media <media@abf.gov.au>

Sent: Monday, 15 July 2019 9:20 AM

To: s. 22(1)(a)(ii)@homeaffairs.gov.au>

Cc: Media Operations <media@homeaffairs.gov.au>; s. 22(1)(a)(ii)@homeaffairs.gov.au>;
s. 22(1)(a)(ii)@homeaffairs.gov.au>; ABF Media <media@abf.gov.au>;
s. 22(1)(a)(ii)@homeaffairs.gov.au>

Subject: FOR INPUT/CLEARANCE: SECURITY DELAYS - AIRPORT [DLM=For-Official-Use-Only]


For-Official-Use-Only 


We have received multiple enquiries about alleged delays at Sydney Airport due to smart gate 
outages. 

Grateful if you can provide any available information on this topic and advise if we are able to 
reuse the same lines as last time we had a delay? 


• All IT systems are now back online and the Department and ABF are continuing to 
test and monitor systems to prevent further issues. 

• Passenger processing is occurring as normal. The ABF has also deployed
additional resources to ensure cargo is cleared in a timely manner. 

• The outage of IT systems yesterday was not directly related to Smartgates, but was 
linked to a back end network issue that impacted a number of systems, including 
those used to process passengers and cargo. 


Kind regards, 


S. 22(1 )(a)(ii) 

Public Affairs Officer, Media Operations 

Media & Engagement Branch | Executive Coordination 

Department of Home Affairs
Media line: 02 6264 2244 P: s. 22(1)(a)(ii)

E: media@homeaffairs.gov.au


For-Official-Use-Only 
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